# Data
library(diversedata) # Diverse Data Hub datasets
# Core libraries
library(tidyverse)
library(lubridate)
# Spatial & mapping
library(sf)
library(terra)
library(ggmap)
library(ggspatial)
library(maptiles)
library(leaflet)
library(leaflet.extras)
# Visualization & color
library(viridis)
# Tables & reporting
library(gt)
library(kableExtra)
# Modeling & interpretation
library(marginaleffects)
library(broom)

Key Features of the Dataset
Each row represents a single wildfire incident and includes information such as:
- temperature – The recorded air temperature (°C) at or near the fire location; higher temperatures often increase fire intensity and spread.
- wind_speed – Wind speed (km/h) during the fire; stronger winds can accelerate fire spread and complicate suppression.
- relative_humidity – The percentage of moisture in the air; lower humidity typically increases fire risk by drying out vegetation.
- fire_spread_rate – The rate at which the fire expanded (e.g., hectares/hour); reflects the fire’s growth dynamics.
- fire_type – Classification of fire behavior (e.g., surface, crown); influences how fires are managed and controlled.
- fuel_type – The dominant type of vegetation or material burned (e.g., grass, timber); determines fire intensity and burn characteristics.
- ia_access – Indicator of how easily suppression crews could access the fire location; limited access can delay response.
- latitude – Geographic latitude coordinate of the fire’s origin; used for spatial analysis and regional modeling.
- longitude – Geographic longitude coordinate of the fire’s origin; used alongside latitude for location-specific insights.
Purpose and Use Cases
This dataset is designed to support analysis of:
- Factors contributing to the spread, intensity, and size of wildfires
- The impact of weather conditions and fuel types on fire behavior
- Geographic and seasonal patterns in wildfire occurrence
- The effectiveness and timeliness of initial suppression efforts
- Relationships between fire causes, detection methods, and responsible parties
Case Study
Objective
Large wildfires pose serious environmental, social, and economic challenges, especially as climate conditions become more extreme. Identifying the key environmental and human factors linked to these fires can help guide more effective prevention and response strategies.
So, our main question is:
Can we identify the environmental and human factors most associated with large wildfires?
According to Natural Resources Canada, wildfires exceeding 200 hectares in final size are classified as “large fires.” While these fires represent a small percentage of all wildfires, they account for the majority of the total area burned annually.
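To make the "few fires, most of the area" point concrete, here is a minimal sketch of how the two shares could be computed. The `toy_fires` data frame and its values are made up for illustration and are not taken from the dataset; only the 200 ha threshold and the `current_size` column name come from this case study.

```r
library(dplyr)

# Hypothetical toy data: final fire sizes in hectares (not real records)
toy_fires <- data.frame(current_size = c(0.1, 0.5, 2, 15, 40, 350, 1200))

large_fire_summary <- toy_fires |>
  mutate(large_fire = current_size > 200) |>   # "large" means more than 200 ha
  summarise(
    share_of_fires = mean(large_fire),                                # fraction of incidents
    share_of_area  = sum(current_size[large_fire]) / sum(current_size) # fraction of hectares
  )

large_fire_summary
```

In this toy example only two of the seven fires are large, yet they account for most of the total hectares. The same two-line summary could be applied to the cleaned wildfire data to see whether the pattern holds there.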
The goal is to explore potential predictors of fire size, such as weather, fire cause, and detection method, and provide insights that could inform early interventions and resource planning.
Analysis
Loading Libraries
1. Data Cleaning & Processing
- Converted fire size to numeric
- Created a binary variable large_fire (TRUE if > 200 ha)
- Filtered out incomplete records
# Reading Data
wildfire_data <- wildfire

# Clean and prepare base data
wildfire_clean <- wildfire_data |>
  # Keep only records with a positive assessed size
  filter(!is.na(assessment_hectares), assessment_hectares > 0) |>
  mutate(
    current_size = as.numeric(current_size),  # no-op if already numeric
    large_fire = current_size > 200,          # TRUE for fires larger than 200 ha
    true_cause = as.factor(true_cause),
    detection_agent_type = as.factor(detection_agent_type),
    temperature = as.numeric(temperature),
    wind_speed = as.numeric(wind_speed)
  )

# Drop unused levels for modeling
wildfire_clean <- wildfire_clean |>
  filter(!is.na(true_cause), !is.na(detection_agent_type)) |>
  mutate(
    true_cause = droplevels(true_cause),
    detection_agent_type = droplevels(detection_agent_type)
  )

2. Exploratory Data Analysis
Map of Wildfire Size and Location in Alberta
This interactive map displays the geographic distribution and relative size of wildfires across Alberta, using red circles sized by fire area. Each point represents a wildfire event, with larger circles indicating more extensive burns. The map reveals regions with concentrated wildfire activity and visually emphasizes differences in fire magnitude across the province.
Note
To provide geographic context for our wildfire data, we added a shapefile representing Alberta’s boundaries.
This shapefile was sourced from the Alberta Government Open Data Portal and specifically corresponds to the Electoral Division Shapefile (Bill 33, 2017).
The data was processed and transformed to the appropriate geographic coordinate system to enable mapping alongside our wildfire dataset.
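The note does not show how the two spatial objects used by the map, `wildfire_sf` and `alberta_shape`, were built. The following is a minimal sketch of one way to do it with `sf`, not the authors' actual code. The toy points, the rough bounding box standing in for the boundary file, and the placeholder path in the comment are all assumptions made so the example runs on its own.

```r
library(sf)

# Hypothetical stand-in for the wildfire records (two made-up points)
toy_fires <- data.frame(
  longitude    = c(-114.1, -116.5),
  latitude     = c(53.5, 56.2),
  current_size = c(12, 450)
)

# Turn the longitude/latitude columns into point geometries in WGS 84 (EPSG:4326)
wildfire_sf <- st_as_sf(toy_fires, coords = c("longitude", "latitude"), crs = 4326)

# With the real shapefile this step would be something like (placeholder path):
# boundary_raw <- st_read("path/to/boundary_file.shp")
# Here a rough bounding box around Alberta, stored in a projected CRS, stands in
boundary_raw <- st_sf(
  name = "Alberta (rough bounding box)",
  geometry = st_as_sfc(
    st_bbox(c(xmin = -120, ymin = 49, xmax = -110, ymax = 60), crs = st_crs(4326))
  )
) |>
  st_transform(3857)

# leaflet expects longitude/latitude, so reproject to EPSG:4326 before mapping
alberta_shape <- st_transform(boundary_raw, 4326)

# Both layers should now share the same coordinate reference system
st_crs(wildfire_sf) == st_crs(alberta_shape)
```

Checking that both layers share a CRS before plotting is a cheap safeguard: mismatched systems are a common reason for polygons and points not lining up on a web map.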
# map
leaflet() |>
  addProviderTiles("CartoDB.Positron") |>
  setView(lng = -115, lat = 55, zoom = 5.5) |>
  addPolygons(data = alberta_shape,
              color = "#CCCCCC",
              weight = 0.5,
              fillOpacity = 0.02,
              group = "Alberta Boundaries") |>
  addCircles(data = wildfire_sf,
             radius = ~sqrt(current_size) * 30,
             fillOpacity = 0.6,
             color = "red",
             stroke = FALSE,
             group = "Wildfires") |>
  addLayersControl(overlayGroups = c("Alberta Boundaries", "Wildfires"),
                   options = layersControlOptions(collapsed = FALSE)) |>
  addLegend(position = "bottomright",
            title = "Wildfire Size (approx.)",
            colors = "red",
            labels = "Larger = Bigger fire")
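
The radius formula in `addCircles()` deserves a short worked example. Because `addCircles()` takes its radius in meters, taking the square root of `current_size` makes the drawn circle's area proportional to the fire's size. The fire sizes below are made-up values chosen only to show the arithmetic.

```r
# Illustrative fire sizes in hectares (made-up values)
current_size <- c(1, 100, 10000)

# Radius formula used in the map; addCircles() interprets it in meters
radius_m <- sqrt(current_size) * 30

# Area of each drawn circle, converted to hectares (1 ha = 10,000 m^2)
drawn_area_ha <- pi * radius_m^2 / 10000

# The ratio of drawn area to fire size is the same for every fire
radius_table <- data.frame(
  current_size,
  radius_m,
  drawn_area_ha,
  ratio = drawn_area_ha / current_size
)
radius_table
```

The constant ratio (about 0.28) means the circles compare fires fairly against each other but understate each fire's true footprint, which fits the "approx." wording in the legend title.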